fgsm attack
Protecting the Neural Networks against FGSM Attack Using Machine Unlearning
Khorasani, Amir Hossein, Jahanian, Ali, Rastgarpour, Maryam
Machine learning is a powerful tool for building predictive models. However, it is vulnerable to adversarial attacks. Fast Gradient Sign Method (FGSM) attacks are a common type of adversarial attack that adds small perturbations to input data to trick a model into misclassifying it. In response to these attacks, researchers have developed methods for "unlearning" these attacks, which involves retraining a model on the original data without the added perturbations. Machine unlearning is a technique that tries to "forget" specific data points from the training dataset, to improve the robustness of a machine learning model against adversarial attacks like FGSM. In this paper, we focus on applying unlearning techniques to the LeNet neural network, a popular architecture for image classification. We evaluate the efficacy of unlearning FGSM attacks on the LeNet network and find that it can significantly improve its robustness against these types of attacks.
Analysis of the vulnerability of machine learning regression models to adversarial attacks using data from 5G wireless networks
Legashev, Leonid, Zhigalov, Artur, Parfenov, Denis
This article describes the process of creating a script and conducting an analytical study of a dataset using the DeepMIMO emulator. An advertorial attack was carried out using the FGSM method to maximize the gradient. A comparison is made of the effectiveness of binary classifiers in the task of detecting distorted data. The dynamics of changes in the quality indicators of the regression model were analyzed in conditions without adversarial attacks, during an adversarial attack and when the distorted data was isolated. It is shown that an adversarial FGSM attack with gradient maximization leads to an increase in the value of the MSE metric by 33% and a decrease in the R2 indicator by 10% on average. The LightGBM binary classifier effectively identifies data with adversarial anomalies with 98% accuracy. Regression machine learning models are susceptible to adversarial attacks, but rapid analysis of network traffic and data transmitted over the network makes it possible to identify malicious activity
The Dark Side of Digital Twins: Adversarial Attacks on AI-Driven Water Forecasting
Homaei, Mohammadhossein, Morales, Victor Gonzalez, Mogollon-Gutierrez, Oscar, Caro, Andres
Digital twins (DTs) are improving water distribution systems by using real-time data, analytics, and prediction models to optimize operations. This paper presents a DT platform designed for a Spanish water supply network, utilizing Long Short-Term Memory (LSTM) networks to predict water consumption. However, machine learning models are vulnerable to adversarial attacks, such as the Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD). These attacks manipulate critical model parameters, injecting subtle distortions that degrade forecasting accuracy. To further exploit these vulnerabilities, we introduce a Learning Automata (LA) and Random LA-based approach that dynamically adjusts perturbations, making adversarial attacks more difficult to detect. Experimental results show that this approach significantly impacts prediction reliability, causing the Mean Absolute Percentage Error (MAPE) to rise from 26% to over 35%. Moreover, adaptive attack strategies amplify this effect, highlighting cybersecurity risks in AI-driven DTs. These findings emphasize the urgent need for robust defenses, including adversarial training, anomaly detection, and secure data pipelines.
A Multi-Scale Isolation Forest Approach for Real-Time Detection and Filtering of FGSM Adversarial Attacks in Video Streams of Autonomous Vehicles
Abhulimhen, Richard, Begashaw, Negash, Comert, Gurcan, Zhao, Chunheng, Pisu, Pierluigi
Deep Neural Networks (DNNs) have demonstrated remarkable success across a wide range of tasks, particularly in fields such as image classification. However, DNNs are highly susceptible to adversarial attacks, where subtle perturbations are introduced to input images, leading to erroneous model outputs. In today's digital era, ensuring the security and integrity of images processed by DNNs is of critical importance. One of the most prominent adversarial attack methods is the Fast Gradient Sign Method (FGSM), which perturbs images in the direction of the loss gradient to deceive the model. This paper presents a novel approach for detecting and filtering FGSM adversarial attacks in image processing tasks. Our proposed method evaluates 10,000 images, each subjected to five different levels of perturbation, characterized by $\epsilon$ values of 0.01, 0.02, 0.05, 0.1, and 0.2. These perturbations are applied in the direction of the loss gradient. We demonstrate that our approach effectively filters adversarially perturbed images, mitigating the impact of FGSM attacks. The method is implemented in Python, and the source code is publicly available on GitHub for reproducibility and further research.
Adversarial Machine Learning: Attacking and Safeguarding Image Datasets
This paper examines the vulnerabilities of convolutional neural networks (CNNs) to adversarial attacks and explores a method for their safeguarding. In this study, CNNs were implemented on four of the most common image datasets, namely CIFAR-10, ImageNet, MNIST, and Fashion-MNIST, and achieved high baseline accuracy. To assess the strength of these models, the Fast Gradient Sign Method was used, which is a type of exploit on the model that is used to bring down the models accuracies by adding a very minimal perturbation to the input image. To counter the FGSM attack, a safeguarding approach went through, which includes retraining the models on clear and pollutant or adversarial images to increase their resistance ability. The next step involves applying FGSM again, but this time to the adversarially trained models, to see how much the accuracy of the models has gone down and evaluate the effectiveness of the defense. It appears that while most level of robustness is achieved against the models after adversarial training, there are still a few losses in the performance of these models against adversarial perturbations. This work emphasizes the need to create better defenses for models deployed in real-world scenarios against adversaries.
Exploring Secure Machine Learning Through Payload Injection and FGSM Attacks on ResNet-50
Yadav, Umesh, Niraula, Suman, Gupta, Gaurav Kumar, Yadav, Bicky
As ML models continue to integrate into critical cybersecurity In the modern landscape of cybersecurity, machine learning systems, the ability to exploit these models through (ML) models, especially in areas like image classification, adversarial techniques poses significant threats. A study predicts are increasingly integrated into systems where robustness and that by 2025, 30% of cyberattacks will involve adversarial security are paramount. However, these models are highly machine-learning tactics[16]. Pre-trained models are susceptible to adversarial attacks, where small, crafted perturbations susceptible to perturbations adversarial attacks, which can can lead to incorrect predictions and, in more severe undermine trust in AI systems due to the lack of customized cases, unauthorized access or manipulation of systems[2].
Is AI Robust Enough for Scientific Research?
Zhang, Jun-Jie, Song, Jiahao, Wang, Xiu-Cheng, Li, Fu-Peng, Liu, Zehan, Chen, Jian-Nan, Dang, Haoning, Wang, Shiyao, Zhang, Yiyan, Xu, Jianhui, Shi, Chunxiang, Wang, Fei, Pang, Long-Gang, Cheng, Nan, Zhang, Weiwei, Zhang, Duo, Meng, Deyu
Artificial Intelligence (AI) has become a transformative tool in scientific research, driving breakthroughs across numerous disciplines [5-11]. Despite these achievements, neural networks, which form the backbone of many AI systems, exhibit significant vulnerabilities. One of the most concerning issues is their susceptibility to adversarial attacks [1, 2, 12, 13]. These attacks involve making small, often imperceptible changes to the input data, causing AI systems to make incorrect predictions (Figure 1), highlighting a critical weakness: AI systems can fail under minimal perturbations - a phenomenon completely unseen in classical methods. The impact of adversarial attacks has been extensively studied in the context of image classification [14-16].
Impact of White-Box Adversarial Attacks on Convolutional Neural Networks
Podder, Rakesh, Ghosh, Sudipto
Autonomous vehicle navigation and healthcare diagnostics are among the many fields where the reliability and security of machine learning models for image data are critical. We conduct a comprehensive investigation into the susceptibility of Convolutional Neural Networks (CNNs), which are widely used for image data, to white-box adversarial attacks. We investigate the effects of various sophisticated attacks -- Fast Gradient Sign Method, Basic Iterative Method, Jacobian-based Saliency Map Attack, Carlini & Wagner, Projected Gradient Descent, and DeepFool -- on CNN performance metrics, (e.g., loss, accuracy), the differential efficacy of adversarial techniques in increasing error rates, the relationship between perceived image quality metrics (e.g., ERGAS, PSNR, SSIM, and SAM) and classification performance, and the comparative effectiveness of iterative versus single-step attacks. Using the MNIST, CIFAR-10, CIFAR-100, and Fashio_MNIST datasets, we explore the effect of different attacks on the CNNs performance metrics by varying the hyperparameters of CNNs. Our study provides insights into the robustness of CNNs against adversarial threats, pinpoints vulnerabilities, and underscores the urgent need for developing robust defense mechanisms to protect CNNs and ensuring their trustworthy deployment in real-world scenarios.
S-E Pipeline: A Vision Transformer (ViT) based Resilient Classification Pipeline for Medical Imaging Against Adversarial Attacks
S, Neha A, Chaturvedi, Vivek, Shafique, Muhammad
--Vision Transformer (ViT) is becoming widely popular in automating accurate disease diagnosis in medical imaging owing to its robust self-attention mechanism. However, ViTs remain vulnerable to adversarial attacks that may thwart the diagnosis process by leading it to intentional misclassification of critical disease. In this paper, we propose a novel image classification pipeline, namely, S-E Pipeline, that performs multiple pre-processing steps that allow ViT to be trained on critical features so as to reduce the impact of input perturbations by adversaries. Our method uses a combination of segmentation and image enhancement techniques such as Contrast Limited Adaptive Histogram Equalization (CLAHE), Unsharp Masking (UM), and High-Frequency Emphasis filtering (HFE) as pre-processing steps to identify critical features that remain intact even after adversarial perturbations. The experimental study demonstrates that our novel pipeline helps in reducing the effect of adversarial attacks by 72.22% for the ViT -b32 model and 86.58% for the ViT -l32 model. Furthermore, we have shown an end-to-end deployment of our proposed method on the NVIDIA Jetson Orin Nano board to demonstrate its practical use case in modern hand-held devices that are usually resource-constrained. Index T erms--vision transformers, medical imaging, adversarial attacks, defense mechanisms, image enhancement, segmentation Deep learning models are becoming pervasive in automating the diagnosis process in medical imaging.
Exploring DNN Robustness Against Adversarial Attacks Using Approximate Multipliers
Askarizadeh, Mohammad Javad, Farahmand, Ebrahim, Castro-Godinez, Jorge, Mahani, Ali, Cabrera-Quiros, Laura, Salazar-Garcia, Carlos
Deep Neural Networks (DNNs) have advanced in many real-world applications, such as healthcare and autonomous driving. However, their high computational complexity and vulnerability to adversarial attacks are ongoing challenges. In this letter, approximate multipliers are used to explore DNN robustness improvement against adversarial attacks. By uniformly replacing accurate multipliers for state-of-the-art approximate ones in DNN layer models, we explore the DNNs robustness against various adversarial attacks in a feasible time. Results show up to 7% accuracy drop due to approximations when no attack is present while improving robust accuracy up to 10% when attacks applied.